Parameter Database : Data-centric Synchronization for Scalable Machine Learning
Authors
Abstract
We propose a new data-centric synchronization framework for carrying out machine learning (ML) tasks in a distributed environment. Our framework exploits the iterative nature of ML algorithms and relaxes the application-agnostic bulk synchronous parallel (BSP) paradigm that has previously been used for distributed machine learning. Data-centric synchronization complements function-centric synchronization, which is based on using stale updates to increase the throughput of distributed ML computations. Experiments to validate our framework suggest that we can attain substantial improvement over BSP while guaranteeing sequential correctness of ML tasks.
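The contrast the abstract draws between a strict BSP barrier and synchronization that tolerates stale updates can be illustrated with a toy check. This is only a sketch of the generic bounded-staleness rule, not the framework proposed in the paper: the function names `may_advance` and `count_runnable`, the per-worker iteration counters, and the `staleness` parameter are all assumptions made for illustration.

```python
from typing import Sequence


def may_advance(worker_clock: int, all_clocks: Sequence[int], staleness: int = 0) -> bool:
    """Decide whether a worker may start its next iteration.

    worker_clock: iterations this worker has completed (hypothetical counter).
    all_clocks:   completed-iteration counts of every worker, including this one.
    staleness:    how many iterations a worker may run ahead of the slowest one.
                  staleness=0 behaves like a BSP barrier.
    """
    if staleness < 0:
        raise ValueError("staleness must be non-negative")
    # A worker is blocked once it is more than `staleness` iterations ahead
    # of the slowest worker.
    return worker_clock - min(all_clocks) <= staleness


def count_runnable(all_clocks: Sequence[int], staleness: int) -> int:
    """Count the workers that are allowed to proceed under the given bound."""
    return sum(may_advance(c, all_clocks, staleness) for c in all_clocks)


if __name__ == "__main__":
    clocks = [3, 3, 2]  # two fast workers, one straggler
    print("BSP-like (staleness=0):", count_runnable(clocks, 0))
    print("Stale updates (staleness=1):", count_runnable(clocks, 1))
```

With a bound of zero the two fast workers wait for the straggler, while a bound of one lets all three proceed, which is the throughput gain that stale updates aim for. How the paper decides when synchronization is required from the data being accessed is not described in the abstract and is not modeled here.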
Similar resources
Factorized Databases: Past and Future Past
In this talk I will overview the FDB project at Oxford on succinct, lossless representations of relational data that I call factorized databases. I will first present a characterization of the succinctness of results to conjunctive queries and how factorizations can speed up query processing. I will then comment on how this succinctness characterization relates to seemingly disparate results on:...
Access control in ultra-large-scale systems using a data-centric middleware
The primary characteristic of an Ultra-Large-Scale (ULS) system is ultra-large size on any related dimension. A ULS system is generally considered as a system-of-systems with heterogeneous nodes and autonomous domains. As the size of a system-of-systems grows, and interoperability demand between sub-systems is increased, achieving more scalable and dynamic access control system becomes an im...
PivotalR: A Package for Machine Learning on Big Data
PivotalR [1] is an R package that provides a front-end to PostgreSQL [2] and all PostgreSQL-like databases such as Pivotal Inc.'s Greenplum Database (GPDB) [3], HAWQ [4] on Hadoop. PivotalR also provides the R wrapper for MADlib [5]. MADlib is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning al...
A Speech Driven Face Animation System Based on Machine Learning
Lip synchronization is the key issue in speech driven face animation system. In this paper, some clustering and machine learning methods are combined together to estimate face animation parameters from audio sequences and then apply the learning results to MPEG-4 based speech driven face animation system. Based on a large recorded audio-visual database, an unsupervised cluster algorithm is prop...
Database Establishment for Machine Learning in NILM
Nonintrusive load monitoring (NILM) is a problem of identifying operating appliances and estimating their energy consumptions based on whole home electric signals. Machine learning concepts and methods have been gradually applied to tackle NILM. A key factor of enabling and advancing machine learning methods in any problem is the availability of proper databases. The Reference Energy Disaggrega...
Journal title:
- CoRR
Volume abs/1508.00703, issue -
Pages -
Publication date 2015